Goto

Collaborating Authors

 uni 0


Bayesian Adaptive Polynomial Chaos Expansions

Rumsey, Kellin N., Francom, Devin, Gibson, Graham C., Tucker, J. Derek, Huerta, Gabriel

arXiv.org Machine Learning

Polynomial chaos expansions (PCE) are widely used for uncertainty quantification (UQ) tasks, particularly in the applied mathematics community. However, PCE has received comparatively less attention in the statistics literature, and fully Bayesian formulations remain rare--especially with implementations in R. Motivated by the success of adaptive Bayesian machine learning models such as BART, BASS, and BPPR, we develop a new fully Bayesian adaptive PCE method with an efficient and accessible R implementation: khaos. Our approach includes a novel proposal distribution that enables data-driven interaction selection, and supports a modified g-prior tailored to PCE structure. Through simulation studies and real-world UQ applications, we demonstrate that Bayesian adaptive PCE provides competitive performance for surrogate modeling, global sensitivity analysis, and ordinal regression tasks.


Time-IMM: A Dataset and Benchmark for Irregular Multimodal Multivariate Time Series

Chang, Ching, Hwang, Jeehyun, Shi, Yidan, Wang, Haixin, Peng, Wen-Chih, Chen, Tien-Fu, Wang, Wei

arXiv.org Artificial Intelligence

Time series data in real-world applications such as healthcare, climate modeling, and finance are often irregular, multimodal, and messy, with varying sampling rates, asynchronous modalities, and pervasive missingness. However, existing benchmarks typically assume clean, regularly sampled, unimodal data, creating a significant gap between research and real-world deployment. We introduce Time-IMM, a dataset specifically designed to capture cause-driven irregularity in multimodal multivariate time series. Time-IMM represents nine distinct types of time series irregularity, categorized into trigger-based, constraint-based, and artifact-based mechanisms. Complementing the dataset, we introduce IMM-TSF, a benchmark library for forecasting on irregular multimodal time series, enabling asynchronous integration and realistic evaluation. IMM-TSF includes specialized fusion modules, including a timestamp-to-text fusion module and a multimodality fusion module, which support both recency-aware averaging and attention-based integration strategies. Empirical results demonstrate that explicitly modeling multimodality on irregular time series data leads to substantial gains in forecasting performance. Time-IMM and IMM-TSF provide a foundation for advancing time series analysis under real-world conditions. The dataset is publicly available at https://github.com/blacksnail789521/Time-IMM, and the benchmark library can be accessed at https://github.com/blacksnail789521/IMM-TSF. Project page: https://blacksnail789521.github.io/time-imm-project-page/


A Clinical-grade Universal Foundation Model for Intraoperative Pathology

Zhao, Zihan, Zhou, Fengtao, Li, Ronggang, Chu, Bing, Zhang, Xinke, Zheng, Xueyi, Zheng, Ke, Wen, Xiaobo, Ma, Jiabo, Wang, Yihui, Chen, Jiewei, Zheng, Chengyou, Zhang, Jiangyu, Wen, Yongqin, Meng, Jiajia, Zeng, Ziqi, Li, Xiaoqing, Li, Jing, Xie, Dan, Ye, Yaping, Wang, Yu, Chen, Hao, Cai, Muyan

arXiv.org Artificial Intelligence

Intraoperative pathology is pivotal to precision surgery, yet its clinical impact is constrained by diagnostic complexity and the limited availability of high-quality frozen-section data. While computational pathology has made significant strides, the lack of large-scale, prospective validation has impeded its routine adoption in surgical workflows. Here, we introduce CRISP, a clinical-grade foundation model developed on over 100,000 frozen sections from eight medical centers, specifically designed to provide Clinical-grade Robust Intraoperative Support for Pathology (CRISP). CRISP was comprehensively evaluated on more than 15,000 intraoperative slides across nearly 100 retrospective diagnostic tasks, including benign-malignant discrimination, key intraoperative decision-making, and pan-cancer detection, etc. The model demonstrated robust generalization across diverse institutions, tumor types, and anatomical sites-including previously unseen sites and rare cancers. In a prospective cohort of over 2,000 patients, CRISP sustained high diagnostic accuracy under real-world conditions, directly informing surgical decisions in 92.6% of cases. Human-AI collaboration further reduced diagnostic workload by 35%, avoided 105 ancillary tests and enhanced detection of micrometastases with 87.5% accuracy. Together, these findings position CRISP as a clinical-grade paradigm for AI-driven intraoperative pathology, bridging computational advances with surgical precision and accelerating the translation of artificial intelligence into routine clinical practice.




Anomaly Detection in Smart Power Grids with Graph-Regularized MS-SVDD: a Multimodal Subspace Learning Approach

Debelle, Thomas, Sohrab, Fahad, Abrahamsson, Pekka, Gabbouj, Moncef

arXiv.org Artificial Intelligence

In this paper, we address an anomaly detection problem in smart power grids using Multimodal Subspace Support Vector Data Description (MS-SVDD). This approach aims to leverage better feature relations by considering the data as coming from different modalities. These data are projected into a shared lower-dimensionality subspace which aims to preserve their inner characteristics. To supplement the previous work on this subject, we introduce novel multimodal graph-embedded regularizers that leverage graph information for every modality to enhance the training process, and we consider an improved training equation that allows us to maximize or minimize each modality according to the specified criteria. We apply this regularized graph-embedded model on a 3-modalities dataset after having generalized MS-SVDD algorithms to any number of modalities. To set up our application, we propose a whole preprocessing procedure to extract One-Class Classification training instances from time-bounded event time series that are used to evaluate both the reliability and earliness of our model for Event Detection.


NLP Sampling: Combining MCMC and NLP Methods for Diverse Constrained Sampling

Toussaint, Marc, Braun, Cornelius V., Ortiz-Haro, Joaquim

arXiv.org Artificial Intelligence

Generating diverse samples under hard constraints is a core challenge in many areas. With this work we aim to provide an integrative view and framework to combine methods from the fields of MCMC, constrained optimization, as well as robotics, and gain insights in their strengths from empirical evaluations. We propose NLP Sampling as a general problem formulation, propose a family of restarting two-phase methods as a framework to integrated methods from across the fields, and evaluate them on analytical and robotic manipulation planning problems. Complementary to this, we provide several conceptual discussions, e.g. on the role of Lagrange parameters, global sampling, and the idea of a Diffused NLP and a corresponding model-based denoising sampler.


Class Incremental Online Streaming Learning

Banerjee, Soumya, Verma, Vinay Kumar, Parag, Toufiq, Singh, Maneesh, Namboodiri, Vinay P.

arXiv.org Artificial Intelligence

A wide variety of methods have been developed to enable lifelong learning in conventional deep neural networks. However, to succeed, these methods require a `batch' of samples to be available and visited multiple times during training. While this works well in a static setting, these methods continue to suffer in a more realistic situation where data arrives in \emph{online streaming manner}. We empirically demonstrate that the performance of current approaches degrades if the input is obtained as a stream of data with the following restrictions: $(i)$ each instance comes one at a time and can be seen only once, and $(ii)$ the input data violates the i.i.d assumption, i.e., there can be a class-based correlation. We propose a novel approach (CIOSL) for the class-incremental learning in an \emph{online streaming setting} to address these challenges. The proposed approach leverages implicit and explicit dual weight regularization and experience replay. The implicit regularization is leveraged via the knowledge distillation, while the explicit regularization incorporates a novel approach for parameter regularization by learning the joint distribution of the buffer replay and the current sample. Also, we propose an efficient online memory replay and replacement buffer strategy that significantly boosts the model's performance. Extensive experiments and ablation on challenging datasets show the efficacy of the proposed method.